Technical Report No: BU-CE-1001. A Discretization Method based on Maximizing the Area Under ROC Curve

Authors

  • Murat Kurtcephe
  • H. Altay Güvenir
Abstract

We present a new discretization method based on the area under the ROC curve (AUC) measure. Maximum Area under ROC Curve Based Discretization (MAD) is a global, static, and supervised discretization method. It discretizes a continuous feature so that the AUC computed from that feature alone is maximized. The proposed method is compared with alternative discretization methods: Entropy-MDLP (Minimum Description Length Principle), which is known as one of the best discretization methods, Fixed Frequency Discretization (FFD), and Proportional Discretization (PD). FFD and PD were proposed recently and are designed for naïve Bayes learning. Evaluations are performed on real-world datasets in terms of M-Measure, an AUC-based metric for multi-class classification, and the accuracy values obtained from the naïve Bayes and Aggregating One-Dependence Estimators (AODE) algorithms. Empirical results indicate that our method is a promising alternative to other discretization methods.
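To make the idea of "the AUC based only on that feature" concrete, the following is a minimal Python sketch, not the MAD algorithm itself, whose details the abstract does not give. It assumes a two-class problem in which larger feature values point toward the positive class, and it computes AUC with the pairwise (Mann-Whitney) formulation, counting ties as half. The function names `auc` and `discretize`, the toy data, and the candidate cut points are all hypothetical and chosen only for illustration.

```python
from bisect import bisect_right


def auc(scores, labels):
    """Pairwise (Mann-Whitney) AUC: the fraction of positive/negative pairs
    in which the positive instance has the higher score; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def discretize(values, cut_points):
    """Map each value to the index of its interval (cut_points must be sorted)."""
    return [bisect_right(cut_points, v) for v in values]


# Toy data (hypothetical): one continuous feature and binary class labels.
values = [0.1, 0.4, 0.35, 0.8, 0.7, 0.2]
labels = [0, 0, 1, 1, 1, 0]

raw_auc = auc(values, labels)

# AUC of the discretized feature for a few hand-picked candidate cut sets.
candidates = ([0.5], [0.3], [0.3, 0.5])
binned_auc = {tuple(c): auc(discretize(values, c), labels) for c in candidates}
best_cuts = max(binned_auc, key=binned_auc.get)
```

On this toy data the cut set `(0.3, 0.5)` scores highest among the three candidates. A supervised, AUC-driven discretizer would search over cut points rather than compare a fixed handful, and the multi-class evaluation mentioned in the abstract (M-Measure) is outside the scope of this sketch.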


Similar resources

A Discretization Method Based on Maximizing the Area under Receiver Operating Characteristic Curve

Many machine learning algorithms require the features to be categorical. Hence, they require all numeric-valued data to be discretized into intervals. In this paper, we present a new discretization method based on the area under the receiver operating characteristic (ROC) curve (AUC) measure. Maximum area under ROC curve-based discretization (MAD) is a global, static and supervised discretization method. MAD...


Risk Estimation by Maximizing the Area under ROC Curve

Risks exist in many different domains; medical diagnoses, financial markets, fraud detection and insurance policies are some examples. Various risk measures and risk estimation systems have hitherto been proposed and this paper suggests a new risk estimation method. Risk estimation by maximizing the area under a receiver operating characteristics (ROC) curve (REMARC) defines risk estimation as ...


Maximizing the Area under the ROC Curve using Incremental Reduced Error Pruning

The use of incremental reduced error pruning for maximizing the area under the ROC curve (AUC) instead of accuracy is investigated. A commonly used accuracy-based exclusion criterion is shown to include rules that result in concave ROC curves as well as to exclude rules that result in convex ROC curves. A previously proposed exclusion criterion for unordered rule sets, based on the lift, is on ...


Which combination of MR imaging modalities is best for predicting recurrent glioblastoma? Study of diagnostic accuracy and reproducibility.

PURPOSE To compare the added value of dynamic contrast material-enhanced (DCE) magnetic resonance (MR) imaging with that of dynamic susceptibility contrast-enhanced (DSC) MR imaging with the combination of contrast-enhanced (CE) T1-weighted imaging and diffusion-weighted (DW) imaging for predicting recurrent...


Score Fusion by Maximizing the Area under the ROC Curve

Information fusion is currently a very active research topic aimed at improving the performance of biometric systems. This paper proposes a novel method for optimizing the parameters of a score fusion model based on maximizing an index related to the Area Under the ROC Curve. This approach has the convenience that the fusion parameters are learned without having to specify the client and impost...



Publication date: 2010